-
Notifications
You must be signed in to change notification settings - Fork 395
Add strong mode for tree borrows #4586
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Thank you for contributing to Miri! |
Thanks for the PR! However, it would have been better to ask first before just going ahead and implementing something. I have a student who was meant to work on this if there is time (though it's not clear yet whether there will be time). |
Not quite. There's also this logic here which has to do writes instead of reads: miri/src/borrow_tracker/tree_borrows/mod.rs Lines 376 to 395 in e5af344
I expect this will cause widespread issues... |
Sorry for not coordinating in advance. I thought there is no news on it since there were no comments from ~3 months ago that the issue was created, and I looked at open PRs and didn't find something related. I just liked to see progress on this, so I opened this PR. I didn't put too much effort on the implementation and tried to make the scope of this PR as small as possible to get fast feedback on the technical details, but didn't expect that it may interfere with your planned works. Feel free to close this if it is causing problems for your student.
Can you give me a test which should be UB if we want to use llvm's writeable attribute, but it is not in my implementation? From the given code I guessed it may have something to do with data races, but I added a data race test, and it is UB with current |
Yeah fair, there was no way for you to know. So, if this starts to become difficult and needs actual design work, I might hold off on this. For now, we can see how it goes. However as you noticed tests are already failing, because creating new
First of all, your approach allows trees where there are two I don't think there is any world in which we want to have "active" nodes without propagating an actual write through the entire tree. Do you have a particular motivation for this approach? In terms of examples, looking at your tests, it's actually quite interesting. The reason you get these behaviors is that when the function returns, an implicit write is performed on all locations that are "active". So the implicit write actually happens, but it happens oddly late (on function return, rather than already on function call). So examples that surprisingly have no UB would be examples that never return from the function, for instance: fn f(x: &mut i32) {
std::process::exit(0);
}
fn main() {
let mut a = &0;
let ptr = &raw const *a as *mut i32;
unsafe { f(&mut *ptr) };
} I don't know if this is actually problematic, but it is certainly surprising. The problematic examples are the ones that are UB now, and they are the reason why I think some deeper design work is required. I assume this is UB now: fn foo(x: &mut [i32]) {
assert!(x.len() >= 2);
unsafe {
let to = x.as_mut_ptr();
let from = x.as_ptr().add(1);
to.copy_from(from, 1);
}
}
foo(&mut [0, 1]); Stacked Borrows accepts this (thanks to a terrible hack). I don't think we can actually roll out an aliasing model that rejects this. (Worse, it remains UB even if you swap |
No, I thought doing a write and starting with active are almost equivalent, but now I understand that they are very different. I will change it to an actual write once we have decided on more important issues. The reason of my stubbornness is that I want to learn the reasons behind decisions. Thank you for your patience and detailed responses.
This one is interesting. I think the problem is that
Otherwise, eliminating the read is a wrong optimization, since a spurious write can happen in the The correct option I think is the option 3, since the For opting out of the
For function attribute, there are many choices, for example a simple What do you think? Do you agree with me that we need some way to opt-out from llvm's |
I am not suggesting to do an actual write, just to propagate the effects of a write through the tree. That's not quite the same thing.
SB doesn't make this code UB. |
Your example isn't UB, but I changed the order of |
Indeed, that's the usual answer here. (This has been discussed a few times in various t-opsem venues.) This requires a bunch of design work for figuring the best way to indicate which references allow spurious writes and which do not.
Indeed. But it's quite crucial that one of the two orders works at least. Ideally both should work.
I have no idea which example you are referring to.
That would probably end very similar to rust-lang/rust#142170. |
So, hard coding std functions doesn't help, since there are similar functions in the ecosystem. I have another idea for making an experiment. Instead of adding a new attribute and causing churn in the ecosystem, we can disable protectors for
The next step really depends on how missing |
I think In addition one of the primary problems with That being said, disabling |
If we're to do an experiment here I'd suggest to just disable the new "implicit write" behavior for inline functions, but keep the protectors (just make them reserved as before this PR). I think what we'd want is that
Ideally we'd suppress both of these retags. Then we fully avoid the issues discussed in rust-lang/unsafe-code-guidelines#450. But it's unclear how to do that since one of the retags happens in the caller... The weaker version of this just suppresses spurious writes, which is much easier: just retag the |
That would be a footgun if my hypothesis "noalias do not help inline functions anyway" is wrong. Let's test my hypothesis in rust-lang/rust#146823
Do you see a world where we omit the retags and have
Not touching the protectors makes sense to me, as we are not going to benefit from it anyway due to the caller's retag, and this is a |
Absolutely not, the |
This comment has been minimized.
This comment has been minimized.
48a0264
to
197e38b
Compare
This PR was rebased onto a different master commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers. |
197e38b
to
9b29953
Compare
Based on the experiments in rust-lang/rust#146823, I ended up not filtering functions based on I added |
Yeah I might say that. ;) Also this should definitely include newtypes around raw pointers, in particular
I wasn't suggesting an "instead of", I meant that both should happen. But this might also work. |
I think at the end we need an attribute to control whether a function can have
How we should detect these newtypes?
I think it works, since the immediate write will make the state immediately Do you think a new experiment like the one in rust-lang/rust#142170 yields new results, now that the |
We can't use that heuristic/hack for What we can consider as a hack/experiment is to tie the decision of whether to create reserved or unique mutable references to a heuristic that looks for raw pointers.
That's an option, though it'd be the first time (outside of ABI guarantees) that we make For a first hack, just recursing below
We don't usually rely on the implicit read/write adjusting the state of the new node, that seems odd. What is wrong is setting the initial state to active without doing an implicit write to do the corresponding updates to everything else. But all the sanity checks we have (that failed in earlier versions of this PR) are written assuming there's only implicit reads, no implicit writes. It might be a bunch of work to fix them, and it's certainly a bunch of work to review any such change for correctness. It really boils down to what the point of this PR is. The goal of rust-lang/unsafe-code-guidelines#573 is to remove SB from Miri. But we won't remove SB in favor of a variant of TB with a heuristic like this. It also seem rather obvious that if we have an attribute to control whether functions immediately "activate" their mutable references, then there's a way to make existing code sound by adding the attribute in enough places. So... what's the new information we learn from playing around with this heuristic? |
If we continue with this, we need to remove But
Using this heuristic, we can find other problems of making tree borrows strong beside |
tree_borrows.perform_access( | ||
new_tag, | ||
Some((range_in_alloc, AccessKind::Write, AccessCause::Reborrow)), | ||
this.machine.borrow_tracker.as_ref().unwrap(), | ||
alloc_id, | ||
this.machine.current_span(), | ||
)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's already code a bit further up that does a write, that should be adjusted instead of adding a copy. We definitely don't want to do a read and a write.
It's perf neutral for rustc itself. That's not necessarily a representative benchmark of its effect on other Rust code.
Maybe we do, maybe we don't, that question is still open. But yeah as you say your PR doesn't remove the retag for any function so it'd be sound without any codegen changes. That wasn't clear to me from your description.
Because it's too ad-hoc to be an acceptable long-term solution in my eyes. We won't remove a model that could be acceptable in favor of one that isn't. We can use this to do a crater-at-home run if @saethlin has time, but I don't think we want to land something like this. |
do_a_write: start_mut_ref_on_fn_entry_as_active | ||
&& is_protected | ||
&& !current_function_involves_raw_pointers | ||
&& matches!(ref_mutability, Some(Mutability::Mut)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of adding a new field, this should set both freeze_perm
and nonfreeze_perm
to Active
.
That should then be used as a signal to control the AccessKind
for the (already existing) initial access logic in tb_reborrow
.
This PR adds a
-Zmiri-tree-borrows-strong
which tries to be what requested in rust-lang/unsafe-code-guidelines#573. Currently, I only implemented the bullet 3 (implicit writes on&mut
retags) to get feedback, but the plan is to eventually add flags for the other bullets under this-Zmiri-tree-borrows-strong
and retirestacked-borrows
by replacing it withtree-borrows-strong
.About the implementation, I didn't do an actual write, and instead set the initial permission to
Active
which I think has the same effect if I understand tree borrows correctly.If anyone wants to run this PR against the ecosystem crates, please check my implementation first. I wrote a test which is passing, but I'm not sure about the corner cases.